(57) Abstract

DNA which encodes the polypeptide 
streptavidin has been isolated as a fragment 
2 kb in length derived from a restriction endonuclease
digestion of the chromosomal DNA of Streptomyces
avidinii. The nucleic acid sequence of the gene and
the amino acid sequence of the polypeptide have been
determined. A fused gene has been prepared which
comprises the streptavidin gene fused to a 
gene encoding the human LDL receptor. Ex- 
pression of the gene fusion results in a fused 
streptavidin-human LDL receptor polypep- 
tide. Methods are provided for using the fused 
gene to produce labeled, chemically modified 
proteins in vivo and to isolate a protein know- 
ing only the nucleotide sequence of the gene 
encoding the protein. 

DNA ENCODING STREPTAVIDIN, STREPTAVIDIN PRODUCED THEREFROM,
FUSED POLYPEPTIDES WHICH INCLUDE AMINO ACID SEQUENCES
PRESENT IN STREPTAVIDIN AND USES THEREOF

Background of the Invention

Certain embodiments of the invention described herein 
were made in the course of work under Grant No. GM
14825-19, from the National Institutes of Health, U.S.
Department of Health and Human Services. The U.S.
Government has certain rights in this invention. 

Throughout this application various publications are 
referenced by arabic numerals within parentheses. Full 
citations for these references may be found at the end 
of the specification immediately preceding the claims. 
The disclosures of these publications in their entire- 
ties are hereby incorporated by reference into this 
application in order to more fully describe the state 
of the art as known to those skilled therein as of the
date of the invention described and claimed herein. 

Streptavidin, a protein produced by Streptomyces
avidinii, forms a very strong and specific non-covalent
complex with the water-soluble vitamin biotin.
Streptavidin was discovered in 1963 (1) as part of an
antibiotic system in culture filtrates of several
species of Streptomyces. Later Chaiet and Wolf (2)
established its chemical nature and determined its amino
acid composition. Streptavidin is a nearly neutral 
60,000 dalton protein. It consists of 4 identical 
subunits each having an approximate molecular weight of 
15,000 daltons. Streptavidin binds 4 molecules of 
biotin per molecule of protein, and it is free of
carbohydrate. Avidin, a basic glycoprotein usually
isolated from chicken egg-whites, shares with streptavidin
some common characteristics such as molecular weight,
subunit composition and the capacity to bind biotin,
forming a complex with biotin of very high affinity
(Kd ~ 10^-15 M) (3-4). Streptavidin and avidin have
different amino acid compositions, but both have an
unusually high content of threonine and tryptophan. Although
streptavidin and avidin (derived from egg-white) bind 
biotin with equally high affinity, streptavidin has the 
advantage of avoiding much of the undesirable, nonspe- 
cific binding associated with avidin at physiological 
pH. The reasons for this are: 1) the isoelectric
point of streptavidin is close to neutral, whereas that
of avidin is 10 (thus avidin is positively charged at pH
7.0); and 2) streptavidin contains no carbohydrate,
while avidin contains approximately 7% carbohydrate. 

At present, commercial preparations of streptavidin
made by growing S. avidinii have several disadvantages:
they are high in cost and are frequently contaminated
with biotin and, as a result, do not have all
four valences free for binding biotin. Furthermore,
production of streptavidin from S. avidinii yields only
limited quantities of streptavidin.

The present invention overcomes the disadvantages of 
present commercial preparations of streptavidin by 
providing an inexpensive source of streptavidin, which 
is essentially free of biotin contamination and has
all four valences free for biotin binding. The present 
invention contemplates vectors which can produce 
streptavidin in large quantities. Furthermore,
improved streptavidins may be produced by site-directed
mutagenesis. 

There have been attempts in the past to devise methods
for labeling and detecting small amounts of interesting
proteins within living cells. Past methods have
included fusing a prokaryotic gene, e.g. the gene for
beta-galactosidase, to genes encoding the interesting
proteins. Expression of the resulting fused gene
results in a fused polypeptide, e.g. one containing the
amino acid sequence from beta-galactosidase, which can
be used for stabilization and isolation of the protein
of interest. However, such methods could not be used
to produce labeled proteins in vivo.

The present invention provides a method of generating 
labeled proteins in vivo, without the need for in vivo
covalent chemical modification. The present method 
utilizes a marker protein which may be non-covalently 
attached to a tag which remains with the protein. This 
method may be used to produce labeled proteins in vivo
or to isolate target proteins knowing only the struc- 
ture of the gene which encodes them. 

Biotin may be conjugated to a variety of biological 
molecules using the strong, specific biotin binding 
capacity of avidin or streptavidin. The fused gene of 
the present invention thus permits the detection,
localization or purification of proteins, carbohydrates
and nucleic acids. 

Summary of the Invention

DNA which encodes the polypeptide streptavidin has been 
isolated as a fragment 2 kb in length derived from a 
restriction endonuclease digestion of the chromosomal 
DNA of Streptomyces avidinii. This DNA has the nucleic
acid sequence set forth in Figure 3. The 2 kb fragment
contains the entire region encoding the streptavidin
polypeptide, a region encoding a signal peptide and
the flanking region DNA which occurs naturally at the
3' and 5' ends of the coding region. The DNA fragment
has been introduced into a cloning vehicle which has 
been inserted into the genomic DNA of a bacterial host 
cell. 

This invention also provides a fused gene which com- 
prises a first DNA fragment encoding a target protein 
of interest fused to a DNA fragment encoding 
streptavidin, said streptavidin having a multiplicity 
of binding sites for biotin or biotin derivatives, 
wherein said fused gene is capable of expressing a 
fused protein in vivo when the gene is inserted into a
suitable expression vector and introduced into a suit- 
able host cell. This fused gene may be used to produce 
labeled, chemically modified proteins in vivo and to
isolate proteins when one knows only the sequence of 
the gene encoding the protein. 

In accordance with the present invention a method for 
producing a labeled protein of interest in vivo
comprises the following steps:

a) ligating the DNA encoding the protein of inter- 
est to the DNA encoding streptavidin of the 
present invention thereby producing a fused 
gene; 

b) inserting the fused gene into a suitable
expression vector;

c) introducing the expression vector into a suit- 
able host cell under appropriate conditions 
permitting expression of the fused gene and 
production of the fused protein; 

d) isolating the fused protein; 

e) incubating the fused protein with biotin or a 
biotin derivative in vitro, thereby producing a
fused protein-biotin complex wherein the biotin 
or biotin derivative is bound to the 
streptavidin portion of the fused protein; and 

f) introducing the fused protein-biotin complex 
into the host cell of step (c) under appropriate
conditions that allow the biotin or biotin
derivative to bind with unlabeled fused protein 
produced by the host cell, thereby producing a 
labeled or chemically modified protein of in- 
terest in vivo. 

A method of isolating a protein of interest comprises
the following steps: 

a) ligating the DNA encoding the protein of inter- 
est to the DNA encoding streptavidin of the 
present invention thereby producing a fused 
gene; 

b) inserting the fused gene into a suitable
expression vector;

c) introducing the expression vector into a
suitable host cell under appropriate conditions
permitting expression of the fused gene and
production of the fused protein;

d) contacting the fused protein with biotin or a
biotin derivative under conditions permitting
the biotin or biotin derivative to bind to the
streptavidin portion of the fused protein,
thereby producing a fused protein-streptavidin-
biotin complex; and

e) isolating the complex and thereby isolating the
protein of interest.

Brief Description of the Figures

Figure 1 depicts the amino-terminal amino acid sequence
of streptavidin and the nucleotide sequences of two
oligonucleotide probes used for the isolation of the
streptavidin gene. (N: A, G, C or T).

Figure 2 depicts the partial restriction map of the 
cloned 2 kb-fragment (A) and strategy used for DNA se- 
quence analysis (B). The arrows indicate the direction 
and extent of the fragments sequenced. The shaded 
region corresponds to the coding sequence. (B: BamHI,
R: RsaI, S: Sau3AI, M: flat I, A: AluI, Sm: SmaI, K:
KpnI, H: flafilll, T: TaqI).

Figure 3 depicts the nucleotide sequence of the gene 
for streptavidin and the restriction sites used for 
modification of the 5' and 3' regions. Above the 
nucleotide sequence is the amino acid sequence of the 
streptavidin protein. The amino acids of the signal 
peptide are indicated with negative numbers. 

Figure 4 depicts the amino acid sequence comparison of 
streptavidin and avidin. Identical residues are en- 
closed by solid lines and chemically similar residues 
by broken lines. Both sequences were aligned to give
maximum homology. (Heterogeneity in residue number 34
of avidin has been reported (25); Ile or Thr is present
in this position).

Figure 5 depicts the comparison of predicted secondary 
structures of streptavidin and avidin. The sequences 
have been aligned as in Figure 4. : alpha-helix, B: 
beta-strand, T: turn. (The final 20 C-terminal amino
acids of streptavidin were not analyzed). 

Figure 6 depicts the restriction map of plasmid
pUC8-S2.

Figure 7 shows the steps and reactions carried out for 
the modification of the 5' region of the streptavidin 
gene. 

Figure 8 shows the reactions and steps carried out in 
the fusion of the streptavidin gene and the human LDL 
receptor gene. 

The present invention provides isolated DNA which
encodes streptavidin. The DNA has been isolated as a
fragment 2 kb in length, is derived from a restriction
endonuclease digestion of the chromosomal DNA of
Streptomyces avidinii, and has the nucleic acid sequence
identified in Figure 3. The 2 kb fragment contains the
entire region encoding the streptavidin polypeptide, a
region encoding a signal peptide and the flanking
region DNA which occurs naturally at the 3' and 5' ends
of the coding region.

A recombinant cloning vehicle is also provided which 
comprises cloning vehicle DNA and the 2 kb segment of 
DNA encoding the polypeptide streptavidin, wherein the 
2 kb segment is derived from the chromosomal DNA of 
Streptomyces avidinii, said cloning vehicle DNA being
characterized by the presence of a first and a second 
restriction enzyme site and the 2 kb segment being 
inserted into said sites. The 2 kb segment contains 
the entire region encoding the polypeptide 
streptavidin, a region encoding a signal peptide, and 
the flanking region DNA which occurs naturally at the 
3' and 5' ends of the coding region.

The cloning vehicle of the present invention may be of 
bacterial or viral origin. A suitable plasmid cloning 
vehicle is a pUC plasmid. A suitable phage cloning 
vehicle is the phage M13. 

The recombinant cloning vehicle of the present
invention has been inserted into a bacterial host cell. A
suitable bacterial host cell is E. coli. A genetically
engineered E. coli host cell containing the recombinant
cloning vehicle of the present invention has been
prepared and is designated JM83 (ATCC Accession No.
53307). A method of preparing streptavidin comprises
cultivating a genetically engineered host cell of the 
present invention under suitable conditions permitting 
expression of the streptavidin gene and recovering the 
streptavidin so produced. 

Substantially pure, biotin free streptavidin produced 
by recombinant DNA techniques comprises four identical 
polypeptide subunits, each having a molecular weight of
about 16,500 daltons and a multiplicity of free biotin
binding sites. The streptavidin subunits each have the 
amino acid sequence of Figure 3. The streptavidin of 
the present invention has a majority of its amino acids 
in the beta-conformation. 

The preferred number of free biotin binding sites is
four. The free biotin binding sites are adjacent to
lysine residues which are at positions 80 and 121. The
free biotin binding sites comprise critical tryptophan
binding residues, wherein the critical tryptophan
binding residues are at positions 21, 79 or 120, and
wherein the critical tryptophan binding residues are
adjacent to lysine residues.

The polypeptide streptavidin may be prepared with an 
amino terminal label which is susceptible to proteo- 
lytic cleavage. The amino terminal label may be a 
radiolabel or a fluorescent label. Alternatively, the 
polypeptide streptavidin may be prepared with a carboxy 
terminal label susceptible to proteolytic cleavage, 
wherein the carboxy terminal label is a radiolabel or a 
fluorescent label. The carboxy terminal label may
also be an identifiable cysteine. 

The present invention also provides a fused gene which
comprises a first DNA fragment encoding a target
protein of interest fused to a second DNA fragment
encoding streptavidin, said streptavidin having a
multiplicity of binding sites for biotin or a biotin
derivative, and wherein the fused gene is capable of
expressing a fused protein in vivo when the gene is inserted
into a suitable expression vector and introduced into
a suitable host cell. The fused gene may have at its
5' end either the first DNA fragment encoding the
target protein or the second DNA fragment encoding
streptavidin. The DNA fragment encoding streptavidin
of the fused gene is 2 kb in length, is derived from a
restriction endonuclease digestion of the chromosomal
DNA of Streptomyces avidinii and has the nucleic acid
sequence of Figure 3. The 2 kb fragment contains the
entire region encoding the polypeptide streptavidin, a
region encoding a signal peptide and the flanking
region DNA which occurs naturally at the 3' and 5' ends
of the coding region.

In one embodiment of the invention, the first DNA
fragment is the gene encoding the human low density
lipoprotein (LDL) receptor. Such a fused gene expresses a
protein which consists of streptavidin at the
N-terminal region of the fused protein and the LDL receptor
protein at the C-terminal region of the fused protein
when the fused gene is inserted into a suitable
expression vector and introduced into a suitable host cell.
The fused gene may be cloned into a mammalian
expression vector which may then be used to transfect a
mammalian host cell with the fused gene. A preferred
mammalian host cell is an NIH 3T3 cell.

An expression vector capable of expressing the fused
gene of the present invention when introduced into a
suitable host cell comprises suitable carrier DNA and
the fused DNA fragments of the present invention.
Suitable carrier DNA may be plasmid or phage DNA. The
expression vector may be a bacterial or eucaryotic
expression vector. Suitable bacterial expression
vectors comprise a double-stranded DNA molecule which
includes in 5' to 3' order the following:

a DNA sequence which contains either a promoter or
a promoter and operator;

a DNA sequence which contains a ribosomal binding
site for rendering the mRNA of the desired gene
capable of binding to ribosomes within the host
cell;

an ATG initiation codon;

a restriction enzyme site for inserting a desired
gene into the vector in phase with the ATG
initiation codon;

a DNA sequence which contains an origin of
replication from a bacterial plasmid capable of autonomous
replication in the host cell; and

a DNA sequence which contains a gene associated
with a selectable or identifiable phenotypic trait
and which is manifested when the vector is present
in the host cell.
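
The 5' to 3' ordering of the vector elements listed above, and the requirement that the inserted gene be in phase with the ATG initiation codon, can be sketched as a simple check. This is a minimal illustration only: the element names, the coordinates and the helper functions are hypothetical and do not describe any particular vector of the invention.

```python
# Element names are illustrative labels for the six items listed in the text.
REQUIRED_ORDER = [
    "promoter",                 # promoter, or promoter plus operator
    "ribosomal_binding_site",
    "atg_initiation_codon",
    "insertion_site",           # restriction site in phase with the ATG
    "origin_of_replication",
    "selectable_marker",
]


def elements_in_order(elements):
    """Return True if every required element is present and the 5' start
    coordinates increase in the order given by REQUIRED_ORDER.

    `elements` is a list of (name, start_coordinate) tuples.
    """
    positions = dict(elements)
    if any(name not in positions for name in REQUIRED_ORDER):
        return False
    coords = [positions[name] for name in REQUIRED_ORDER]
    return all(a < b for a, b in zip(coords, coords[1:]))


def in_phase(atg_start, insert_start):
    """Return True if the first base of the inserted gene begins a codon
    in the reading frame set by the ATG initiation codon."""
    return (insert_start - atg_start) % 3 == 0


# Hypothetical coordinates, chosen only to exercise the checks.
hypothetical_vector = [
    ("promoter", 0),
    ("ribosomal_binding_site", 40),
    ("atg_initiation_codon", 52),
    ("insertion_site", 58),
    ("origin_of_replication", 900),
    ("selectable_marker", 1800),
]
```

With these hypothetical coordinates the insertion site lies six bases downstream of the ATG, so an inserted gene would be read in the same frame; shifting the site by one base would break the phase.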




Also provided is a fused protein encoded by the fused
gene of the present invention, wherein a target
protein of interest is fused to streptavidin, wherein the
streptavidin has a multiplicity of binding sites for
biotin or a biotin derivative. In one embodiment, the
target protein is the human LDL receptor. In another
embodiment, the target protein is a monoclonal
antibody. In a further embodiment, the biotin derivative
is a fluorescent biotin.

A method for producing a labeled protein of interest in
vivo comprises the following steps:

a) ligating the DNA encoding the protein of
interest to the DNA encoding streptavidin of the
present invention thereby producing a fused
gene;

b) inserting the fused gene into a suitable
expression vector;

c) introducing the expression vector into a
suitable host cell under appropriate conditions
permitting expression of the fused gene and
production of the fused protein;

d) isolating the fused protein;

e) incubating the fused protein with biotin or a
biotin derivative in vitro, thereby producing a
fused protein-biotin complex wherein the biotin
or biotin derivative is bound to the
streptavidin portion of the fused protein; and

f) introducing the fused protein-biotin complex
into the host cell of step (c) under appropriate
conditions that allow the biotin or biotin
derivative to bind with unlabeled fused protein
produced by the host cell, thereby producing a
labeled or chemically modified protein of
interest in vivo.

A method of isolating a protein of interest comprises 
the following steps: 

a) ligating the DNA encoding the protein of
interest to the DNA encoding streptavidin of the
present invention thereby producing a fused
gene;

b) inserting the fused gene into a suitable
expression vector;

c) introducing the expression vector into a
suitable host cell under appropriate conditions
permitting expression of the fused gene and
production of the fused protein;

d) contacting the fused protein with biotin or a
biotin derivative under conditions permitting
the biotin or biotin derivative to bind to the
streptavidin portion of the fused protein,
thereby producing a fused protein-streptavidin-
biotin complex; and

e) isolating the complex and thereby isolating the
protein of interest.

The present invention provides a method of generating
labeled proteins in vivo and a method of isolating a
target protein knowing only the nucleotide sequence of
its gene. The basic concept is to fuse the gene
encoding a protein (target) of interest to the gene encoding
another protein (marker) which has a binding site
having a high affinity for a specific ligand, wherein a
protein fusion is produced when the gene is expressed
in vivo. In this manner the ligand binding site can be
used to create a chemically labeled protein in vivo by
the addition of appropriate modified ligands. The
target protein can be any protein of interest. For
example, any protein of bacterial or viral origin can
be a target protein if the nucleotide structure of the
gene encoding the protein is known. In the present
invention the marker protein is streptavidin. However,
aequorin or any other protein having a high affinity
ligand binding site may be a suitable marker protein.
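
The gene-fusion concept described above can be illustrated with a toy sketch in which a marker coding sequence is joined in frame to a target coding sequence, so that translation yields a single polypeptide. The two short sequences are hypothetical placeholders, not the streptavidin or LDL receptor genes, and the helper names are assumptions made for illustration. The marker is placed at the 5' end here, although the text allows either order.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: aa
               for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def strip_stop(orf):
    """Remove a terminal stop codon so translation can continue downstream."""
    if len(orf) % 3 != 0:
        raise ValueError("open reading frame length must be a multiple of 3")
    return orf[:-3] if orf[-3:] in STOP_CODONS else orf


def fuse_in_frame(marker_orf, target_orf):
    """Join two open reading frames so both are read in the same frame."""
    marker = strip_stop(marker_orf.upper())
    target = target_orf.upper()
    if len(target) % 3 != 0:
        raise ValueError("target reading frame length must be a multiple of 3")
    return marker + target


def translate(orf):
    """Translate codon by codon until the first stop codon."""
    protein = []
    for i in range(0, len(orf) - 2, 3):
        residue = CODON_TABLE[orf[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical mini-genes used only to show the mechanics.
marker_gene = "ATGGACCCCTAA"   # Met-Asp-Pro-stop
target_gene = "ATGAAATGGTGA"   # Met-Lys-Trp-stop
fused_gene = fuse_in_frame(marker_gene, target_gene)
fused_protein = translate(fused_gene)
```

Removing the stop codon of the upstream sequence is what allows the ribosome to read through into the downstream sequence; leaving it in place would yield only the marker polypeptide.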


Streptavidin binds biotin and many chemically modified
biotins are available, such as fluorescent biotins,
which also bind to streptavidin. Thus a gene fusion
with the streptavidin gene allows the in vivo
production of fused proteins which may be specifically
labeled with a fluorophore whenever desired, in vivo or in
vitro.

The present invention contemplates the production of
labeled monoclonal antibodies. Such monoclonal
antibodies would have a unique attachment site for a
fluorescent dye. No covalent in vitro modification would
be required. There would be no batch to batch
variation in the product. Also, the present invention
contemplates fusion labeled proteins to facilitate the
isolation of rare or unstable proteins by making use of
existing biotin-streptavidin affinity separation
schemes.

It is desirable to be able to label and detect small
amounts of interesting proteins within living cells.
The methods of the present invention enable the
isolation of proteins knowing only the DNA sequence encoding
them. The implications and applications are many. For
example, in cells producing a labeled oncogene
product, the cellular location of the oncogene product may
be analyzed using the methods of the present invention.
For specific genes which are turned on in only a few
cells, these cells can be isolated and identified by
FACS.

A number of small proteins are easily detected in situ
either because they are chemiluminescent when the
correct cofactor is added or because they bind a small
molecule with great specificity and affinity. The 
genes for these proteins may be cloned and placed into 
vectors that promote strong expression in mammalian 
cells. This system may be used to confirm the ability 
to detect the protein by the addition of cofactors or 
labeled small molecules. The vector may be altered to 
facilitate the construction of protein fusions. In- 
serting a gene for a cellular protein into the vector 
will result in a gene fusion. The vector containing 
fused genes may be reinserted into cells and the prop- 
erties, location, extent, and control of the in vivo 
synthesized fusion protein characterized. Where
suitable mutants exist one will also be able to assess
whether the protein fusion retains normal function. 
If necessary, a short collagen bridge may be construct- 
ed between the cellular protein and the labeled pro- 
tein. 

EXAMPLE 1

Isolation and Characterization of a Genomic DNA Clone
Encoding Streptavidin

Materials and Methods 

Enzymes and other reagents. All enzymes and chemicals
used were from Bethesda Research Laboratories, New
England Biolabs, Boehringer Mannheim Biochemicals or
Pharmacia P-L Biochemicals. Radiochemicals were from
New England Nuclear. Streptavidin, pUC8 and M13 were
supplied by Bethesda Research Laboratories.

Amino acid sequence and amino acid analysis. Analysis
by SDS-polyacrylamide gel electrophoresis of the
preparation of streptavidin used showed, in addition to a
main protein band, some material of lower molecular
weight, possibly a degradation product of the protein.
In order to obtain a pure component for amino acid
sequence analysis, the preparation of streptavidin was
electrophoresed in a preparative 15% slab
SDS-polyacrylamide gel (9) and the main, higher molecular weight
protein band was purified from the gel. Visualization
of the protein bands, elution and SDS elimination were
carried out essentially according to Hager and Burgess
(10). Amino terminal sequence analysis of the protein
was performed using a Beckman 890B automatic sequencer.
The identification of amino acids was carried out by
HPLC (11). For amino acid analysis, the gel-purified
protein was hydrolyzed with 6 N HCl in the presence of
beta-mercaptoethanol (1:1000) at 110°C under vacuum for
24 h, and the hydrolysate was analyzed on a Beckman
121MB amino acid analyzer.

Synthesis, purification and labeling of
oligonucleotides. Oligonucleotide mixtures were synthesized by the
solid-phase phosphite triester method using an Applied
Biologicals DNA/RNA synthesizer (12).

The oligonucleotides were purified by preparative
polyacrylamide gel electrophoresis on a 15% sequencing gel.
The oligonucleotide probes used for the isolation of the
streptavidin gene are depicted in Figure 1.

Purified oligonucleotides were labelled at the 5' end
with gamma-[32P]ATP (4,000-6,000 Ci/mmol) and
polynucleotide kinase. Unincorporated ATP was removed by
chromatography on DEAE-cellulose (13).

Construction of the genomic library from Streptomyces
avidinii. Purified chromosomal DNA from Streptomyces
avidinii was partially digested with MboI and the DNA
fragments ranging between 6-19 kb were purified by
agarose gel electrophoresis. Charon 30 DNA (14) was
digested to completion with BamHI, the arms isolated by
agarose gel electrophoresis and then ligated with the
DNA fragments of Streptomyces avidinii using T4 DNA
ligase. The recombinant DNA was packaged in vitro into
viable bacteriophage particles according to Maniatis et
al. (15).

Screening of clones. E. coli LE392 cells were
infected with the recombinant phages, plated in NZYCM
soft agarose on NZYCM agar plates and grown at 37°C.
Two plates containing approximately 8x10^ phages each
were used for the screening. Three replica plates were
prepared for hybridization according to Benton and Davis
(16). Filters were pre-hybridized in 75 mM Tris-HCl pH
8, 100 mM sodium phosphate pH 6.5, 750 mM NaCl, 5 mM
EDTA, 1% SDS, 10 x Denhardt and 100 micrograms per ml of
denatured salmon sperm DNA for 3 h at 25°C.

Hybridization was done in the same solution in the
presence of 4 ng/ml of labelled probe (Stv14, see Fig.
1) at a specific activity of 10^8-10^9 cpm per microgram
of oligonucleotide. Filters were hybridized at 25, 28
and 31°C (one replica at each temperature) for 30-36 h,
then washed at 25°C for 45 min with three changes of 250
ml of the same solution used for pre-hybridization
except that Denhardt and DNA were omitted. Filters were
blotted dry and exposed to Kodak XR5 X-ray film with an
intensifying screen.
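
The low hybridization temperatures used above are typical for short oligonucleotide probes. As a rough guide only, the dissociation temperature of a short oligonucleotide is often estimated with the Wallace rule, 2°C for each A or T plus 4°C for each G or C. The sketch below applies that rule to a hypothetical 14-mer; the sequence is an assumption chosen for illustration and is not the Stv14 probe, and the rule is an approximation that the specification itself does not invoke.

```python
def wallace_td(oligo):
    """Estimate the dissociation temperature (deg C) of a short
    oligonucleotide with the Wallace rule: 2 per A/T plus 4 per G/C."""
    seq = oligo.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G and T")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


def lowest_td(oligo_mixture):
    """For a mixed probe, the least stable member limits the usable
    hybridization temperature."""
    return min(wallace_td(oligo) for oligo in oligo_mixture)


# Hypothetical 14-nucleotide probe: 9 A/T and 5 G/C.
HYPOTHETICAL_14MER = "ATGTACAACCAGAA"
estimate = wallace_td(HYPOTHETICAL_14MER)
```

Hybridizing replicas at several temperatures below such an estimate, as in the screening described above, is one way to distinguish stable from marginal hybrids when the exact member of a probe mixture that matches the gene is unknown.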

DNA sequence analysis. Restriction fragments of the
gene were subcloned into M13 mp18 and mp19 (17) and
sequenced by the dideoxy chain termination method (18).

Additionally, the streptavidin gene (2 kb fragment) was
subcloned into the plasmid pUC8, resulting in the
formation of a new plasmid designated pUC8-S2. A
restriction map of the plasmid pUC8-S2 is depicted in Figure 6.

The plasmid pUC8-S2 was used to transform E. coli strain
K-12, resulting in new strain JM83. E. coli strain JM83,
containing the plasmid pUC8-S2, has been deposited with
the American Type Culture Collection, Rockville, Md., as
ATCC No. 53307. This deposit was made pursuant to the
Budapest Treaty On The International Recognition Of The
Deposit Of Microorganisms For the Purposes of Patent
Procedure.

Secondary structure prediction method. Computer
programs have been developed that compare the amino acid
sequences of proteins to a series of sequence patterns
that have been shown to be characteristic of secondary
structure elements in proteins of known tertiary
structure (19-21). These patterns have been found to be
approximately 90% accurate in identifying the turns that
separate helices and beta strands (20). The patterns
used to evaluate helical and beta propensities were
taken from a study of alpha/beta proteins (19) augmented
with other characteristics of all-helical and all-beta
proteins (20). These patterns are clearly less reliable
(ca. 70% correct) than the turn finding procedure.
Extension of the methods to groups of sequences known to
be closely related (e.g. myoglobins and
immunoglobulins) did not degrade the reliability of the method
(19).

Results and Discussion 

Amino acid sequence of streptavidin. Amino-terminal
amino acid analysis of a commercial preparation of
streptavidin indicated the presence of both alanine and
aspartic acid in the first cycle of Edman degradation of
the protein. This heterogeneity can be explained by the
fact that when this preparation was examined by
SDS-polyacrylamide gel electrophoresis, two main protein
bands with approximate molecular weights of 17.5 and
15.5 kd were observed. The higher molecular weight band
accounted for 60-70% of the total stained protein
material present in the gel. To determine the amino acid
sequence, the 17.5 kd polypeptide chain was gel purified
as previously described in the Materials and Methods
section. Figure 1 shows the amino acid sequence
obtained for the 40 amino-terminal residues of the
protein.

Isolation of the clone containing the streptavidin gene.
The approach used for the isolation of the clone
containing the streptavidin gene was to screen a genomic
library of Streptomyces avidinii with a mixture of 16
oligonucleotides that represent all possible codon
combinations for a small portion of the amino acid
sequence of streptavidin (Fig. 1). One specific probe
14 nucleotides long was designated Stv14.
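
How a mixture of oligonucleotides can represent all possible codon combinations for a short stretch of amino acid sequence may be sketched as follows. The peptide used here is hypothetical (the actual residues are those shown in Figure 1 and are not reproduced here); it was chosen only because, under the standard genetic code, it also gives 16 combinations. The function names are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: aa
               for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)}


def codons_for(residue):
    """Return all codons that encode a one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == residue)


def degenerate_oligos(peptide):
    """Enumerate every DNA sequence that encodes `peptide`."""
    choices = [codons_for(residue) for residue in peptide]
    return ["".join(combo) for combo in product(*choices)]


def translate(dna):
    """Translate a DNA string whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical peptide: Met (1 codon) and Tyr, Asn, Gln, Lys (2 codons each),
# giving 1 x 2 x 2 x 2 x 2 = 16 coding sequences.
HYPOTHETICAL_PEPTIDE = "MYNQK"
oligos = degenerate_oligos(HYPOTHETICAL_PEPTIDE)
```

Choosing a stretch rich in residues with one or two codons keeps the mixture small; a peptide containing leucine, serine or arginine (six codons each) would multiply the number of oligonucleotides required.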

Several clones which remained positive at the three
temperatures of hybridization used (see Materials and
Methods) were isolated. In order to confirm the
presence of the desired clone, purified DNA from each
presumptive positive clone was cut with BamHI, and the DNA
fragments were separated by agarose gel electrophoresis and
analyzed by Southern blot technique (22). In addition
to Stv14, another probe, Stv11 (Fig. 1), which was
derived from a different part of the amino acid sequence,
was used. Both probes, Stv14 and Stv11, hybridized
specifically to a single fragment of approximately 2 kb.

The Southern blot analysis of the cloned DNA for
streptavidin was accomplished by digesting the DNA from
a positive clone with BamHI. The DNA fragments were
subjected to electrophoresis on a 0.9% agarose gel,
visualized by staining with ethidium bromide, and
transferred to nitrocellulose filter paper by the standard
Southern blotting technique (22). Duplicate blots were
hybridized with 20 ng/ml of 32P-labeled Stv14 or Stv11
at 27°C for 20 hours. The hybridization solution and
the washing conditions were the same as those used for the
screening of the library.
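
The fragment pattern expected from a complete BamHI digest, such as the one analyzed above, can be sketched in a few lines. BamHI recognizes GGATCC and cuts after the first G. The DNA string below is a toy sequence made up for illustration, the molecule is assumed to be linear, and the function name is hypothetical.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence
CUT_OFFSET = 1         # BamHI cuts between the first G and GATCC


def bamhi_fragments(dna):
    """Return the fragment lengths, 5' to 3', produced by a complete
    BamHI digest of a linear DNA molecule."""
    seq = dna.upper()
    cuts = []
    start = seq.find(BAMHI_SITE)
    while start != -1:
        cuts.append(start + CUT_OFFSET)
        start = seq.find(BAMHI_SITE, start + 1)
    edges = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(edges, edges[1:])]


# Toy 40-base sequence with two BamHI sites (not streptavidin DNA).
toy_dna = "AAAA" + "GGATCC" + "T" * 20 + "GGATCC" + "CCCC"
fragment_lengths = bamhi_fragments(toy_dna)
```

In a Southern blot only the fragments containing sequence complementary to the probe give a signal, which is why both probes lighting up the same single band points to one fragment carrying the gene.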

Nucleotide sequence analysis and amino acid sequence.
In order to identify the region containing the
complementary sequence of the probe, the 2 kb-fragment was cut
with [...], subcloned into BamHI-cut M13 and the
recombinants screened with 32P-labelled Stv14 probe. The DNA
sequence obtained from isolated positive clones showed
the presence of part of the coding region of the gene
and the sequence complementary to both probes. To
localize this sequence within the 2 kb-fragment, a
partial restriction map of the 2 kb-fragment was
prepared using the method of Smith and Birnstiel (23). In
order to obtain the complete nucleotide sequence of the
gene, appropriate overlapping fragments were subcloned
into M13 and sequenced. Figure 2 shows the partial
restriction map of the 2 kb-fragment and the strategy
used to sequence the streptavidin gene.

The complete nucleotide sequence of the streptavidin
gene, along with the amino acid sequence, is shown in
Figure 3. The amino acid sequence of residues 1 to 40
agrees exactly with that obtained from the
protein sequence shown in Figure 1. The amino-terminal
amino acid of the protein isolated in vitro is aspartic
acid; thus residues -24 to -1 must be post-translationally
removed to yield this mature protein. The extra 24
amino acids show common characteristics with the
signal peptides present in the genes of most secreted
proteins (24). This finding is in agreement with the
fact that streptavidin has been described as a secreted
protein (1). After amino-terminal processing, the mature
protein contains 159 amino acids and has a calculated
molecular weight of 16,500 daltons, which is in close
agreement with the value of approximately 17,500 daltons
found for each streptavidin subunit by SDS-polyacrylamide
gel electrophoresis.

A comparison of the following three different determinations
of the amino acid composition of streptavidin is
shown in Table 1: the amino acid composition as deduced
from the nucleotide sequence of the streptavidin gene,
the amino acid composition derived from analysis of the
gel-purified protein, and a previously reported amino
acid composition (4).

Table 1

Amino acid composition of streptavidin

                         Residues per subunit

          Composition
          deduced from        Amino acid         Amino acid
Amino     nucleotide          analysis           analysis
acid      sequence (a)        (this work) (b)    (earlier work) (c)

Lys            8                  8.7                 4
His            2                  2.6                 2
Arg            4              [illegible]         [illegible]
Asp            8 }               18.0 (*)            12
Asn           10 }
Thr           19                 18.3                19
Ser           14                 13.0                10
Glu            5 }               11.3 (*)         [illegible]
Gln            6 }
Pro            4                  3.7                 2
Gly           18                 20.6                17
Ala           25                 25.0                17
Cys            0                  0                   0
Val           10                 10.1                 7
Met            0                  0                   0
Ile            4                  4.0                 3
Leu            8                  8.5                 8
Tyr            6                  6.1                 6
Phe            2                  2.1                 2
Trp            6                  4.0 (#)             8

(a) The composition of the mature protein after N-terminal
processing is given.

(b) The values were calculated from the amino acid
analysis of the gel-purified protein.

(c) The values were taken from reference (4).

(*) Because acid hydrolysis of proteins results in
deamination of asparagine and glutamine, these amino
acids are not distinguished from aspartate and glutamate.

(#) Tryptophan recovery was low since HCl hydrolysis
was employed (addition of beta-mercaptoethanol permitted
some recovery of tryptophan).
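
The calculated molecular weight of about 16,500 daltons can be reproduced from the composition deduced from the nucleotide sequence (first column of Table 1). This is a rough sketch: the average residue masses below are standard textbook values and are an assumption of the example, not figures taken from the specification.

```python
# Residues per mature subunit, as deduced from the nucleotide sequence (Table 1).
COMPOSITION = {
    "Lys": 8, "His": 2, "Arg": 4, "Asp": 8, "Asn": 10, "Thr": 19, "Ser": 14,
    "Glu": 5, "Gln": 6, "Pro": 4, "Gly": 18, "Ala": 25, "Cys": 0, "Val": 10,
    "Met": 0, "Ile": 4, "Leu": 8, "Tyr": 6, "Phe": 2, "Trp": 6,
}

# Average residue masses in daltons (free amino acid minus one water).
# Assumed standard values, not taken from the source document.
RESIDUE_MASS = {
    "Gly": 57.05, "Ala": 71.08, "Ser": 87.08, "Pro": 97.12, "Val": 99.13,
    "Thr": 101.10, "Cys": 103.14, "Leu": 113.16, "Ile": 113.16, "Asn": 114.10,
    "Asp": 115.09, "Gln": 128.13, "Lys": 128.17, "Glu": 129.12, "Met": 131.19,
    "His": 137.14, "Phe": 147.18, "Arg": 156.19, "Tyr": 163.18, "Trp": 186.21,
}
WATER = 18.02  # one water for the free amino and carboxyl termini


def subunit_mass(composition):
    """Average mass of a polypeptide with the given residue counts."""
    return sum(n * RESIDUE_MASS[aa] for aa, n in composition.items()) + WATER


n_residues = sum(COMPOSITION.values())
mass = subunit_mass(COMPOSITION)
print(n_residues, round(mass))
```

The residue counts add up to the 159 amino acids stated for the mature protein, and the mass lands within about 100 daltons of the quoted 16,500.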

The values obtained from nucleotide sequencing are in
good agreement with those obtained from amino acid
analysis of the gel-purified protein, within the error of
amino acid analysis. The previously reported number of
residues per streptavidin subunit was calculated assuming
a total of 130 residues for the protein (4). Comparison
of these values with those obtained from the
nucleotide sequence shows differences in several amino
acids. This discrepancy cannot be explained by an
underestimation in the total number of residues, since
some differences persist and others appear after correction
of the reported values for a total of 159 amino
acids. It is interesting to point out that identical or
similar values are found for those amino acids that are
absent or rarely present in the N- or C-terminal region
of the processed streptavidin. In addition to this
observation, a different commercial preparation of
streptavidin showed a lower and variable molecular
weight than the polypeptide that was used to determine
the amino acid sequence. This suggests that the N-
and/or C-terminal regions of the protein may be
particularly susceptible to proteolytic degradation.
Calculations show that the 10-12 N-terminal residues plus the
19-21 C-terminal residues account, approximately, for
the discrepancy found in the amino acid content shown in
Table 1. Therefore, it is believed that the previously
reported amino acid analysis was probably obtained from
a partially degraded streptavidin.
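
The correction from 130 to 159 residues mentioned above is a simple proportional rescaling. The sketch below applies it only to those Table 1 entries whose earlier-work values are clearly legible; restricting the comparison to these rows is a choice made for this illustration.

```python
# (deduced from nucleotide sequence, earlier analysis from reference 4),
# residues per subunit, for rows of Table 1 that are clearly legible.
TABLE_1 = {
    "Lys": (8, 4), "His": (2, 2), "Pro": (4, 2), "Gly": (18, 17),
    "Ala": (25, 17), "Val": (10, 7), "Ile": (4, 3), "Leu": (8, 8),
    "Tyr": (6, 6), "Phe": (2, 2), "Trp": (6, 8),
}

ASSUMED_TOTAL = 130   # subunit length assumed in the earlier work
ACTUAL_TOTAL = 159    # subunit length deduced from the gene


def rescale(value):
    """Rescale a residue count from a 130-residue to a 159-residue subunit."""
    return value * ACTUAL_TOTAL / ASSUMED_TOTAL


corrected = {aa: round(rescale(old), 1) for aa, (_, old) in TABLE_1.items()}

for aa, (deduced, old) in TABLE_1.items():
    print(f"{aa}: deduced {deduced:2d}, reported {old:2d}, "
          f"rescaled {corrected[aa]:4.1f}")
```

After rescaling, lysine and alanine remain well below the deduced values, while leucine and tyrosine, which matched before, now overshoot. This is the pattern the text describes as differences that persist and others that appear.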

Primary and secondary structure comparison of
streptavidin and avidin. Figure 4 shows the amino acid
sequence of streptavidin compared with that of avidin
(25), the biotin-binding protein from chicken egg-white.
Streptavidin has 159 amino acids compared with 128 for
avidin. Several regions of extensive homology were
found between the two proteins. Of particular interest is
the homology around and including tryptophans 21, 79 and
120 of streptavidin. In avidin, the corresponding
tryptophans 10, 70 and 110 are protected by biotin from
oxidizing agents, suggesting that these residues are
implicated in the biotin-binding site of the protein
(4). Besides this, a unique NH2 group, probably one of
the three lysine residues (9, 71 and 111) which are
adjacent to the tryptophans, has been found to be
important for the biotin-binding activity of avidin (4). In
streptavidin, two of these three lysines are conserved
as lysine residues (80 and 121), also next to
tryptophans.

Secondary structures were calculated for both proteins
using algorithms that predict conformation from amino
acid sequence (19-21). Figure 5 shows the residues at
which alpha-helical, beta-strand or turn features are
centered. Both proteins show a clear structural homology
with a high preponderance of beta-structure. The
alternating hydrophobic, hydrophilic pattern for most of
the suggested beta-strands is consistent with a folded
beta-sheet or beta-barrel geometry (26). The overall
composition pattern of both sequences suggests that both
proteins fall in the family of "all beta" proteins (27).
The list of turns shown in Figure 5 is incomplete but
there is a good probability (19) that the assigned ones
are correct. The extent and exact location of
beta-structure is more difficult to predict. On the other
hand, it is clear there is little, if any, alpha-helix in
either protein. The best chance for finding alpha-helices
is in the N-terminal region of streptavidin and the
C-terminal region in both proteins.

In agreement with these predictions, avidin has been
found to have a content of 55% of beta-structure and 5%
of alpha-helix as determined by Raman spectroscopy (28).
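
The alternating hydrophobic, hydrophilic pattern mentioned above can be made concrete with a simple score. The following sketch is not the prediction method of references (19-21); it assumes the Kyte-Doolittle hydropathy scale and merely counts how often neighbouring residues switch between hydrophobic and hydrophilic within a window. The function names and the window width are illustrative choices.

```python
# Kyte-Doolittle hydropathy values, one-letter amino acid codes.
# The choice of this scale is an assumption of the example.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def alternation_fraction(segment):
    """Fraction of adjacent residue pairs whose hydropathy signs differ."""
    signs = [KD[res] > 0 for res in segment]
    pairs = list(zip(signs, signs[1:]))
    if not pairs:
        return 0.0
    return sum(a != b for a, b in pairs) / len(pairs)


def best_window(sequence, width=7):
    """0-based start and score of the most strongly alternating window."""
    scored = [(alternation_fraction(sequence[i:i + width]), -i)
              for i in range(len(sequence) - width + 1)]
    score, neg_start = max(scored)
    return -neg_start, score


# First 40 residues of mature streptavidin (Figure 1), one-letter code.
N_TERMINAL_40 = "DPSKDSKAQVSAAEAGITGTWYNQLGSTFIVTAGADGALT"
print(best_window(N_TERMINAL_40))
```

A window scoring close to 1.0 has side chains that alternate almost perfectly, the arrangement expected when one face of a beta-strand is buried and the other is exposed to solvent.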

EXAMPLE 2

Expression of a Streptavidin-Human LDL Receptor Gene
Fusion in Mammalian Cells

A gene construction fusing the streptavidin gene to the
human low density lipoprotein (LDL) receptor gene, so
that their reading frames remained in phase, was made in
such a way that the streptavidin gene was located at the
5' end of the gene fusion and the human LDL receptor
gene at the 3' end. The expressed protein consists of
streptavidin at the N-terminal region and the LDL
receptor at the C-terminal region of the hybrid protein.

In order to fuse both genes, 11 codons of the 3' region
of the streptavidin gene were deleted in vitro. The
region of the LDL receptor gene used in the fusion was
the region that codes for 159 amino acids of the
C-terminal region of the protein. In the native receptor
this region comprises a short extracellular tail (88
amino acids), the membrane spanning region (22 amino
acids) and the intracellular domain (49 amino acids).

Modification of the 5' and 3' regions of the streptavidin
gene. The nucleotide sequence of the streptavidin (STV)
gene and the restriction sites used for the modification
of the 5' and 3' regions are shown in Fig. 3.

Figure 7 shows the reactions carried out for the
modification of the 5' region of the STV gene. A 2 kb
fragment containing the STV gene (Fig. 7, Step A) was
treated with MstI and KpnI and the resulting fragment
containing the STV gene purified (Fig. 7, Step B). This
fragment was modified by the addition of a synthetic
oligonucleotide containing the sequence of the STV gene
eliminated by MstI treatment as well as a restriction
site for the enzyme KpnI placed immediately upstream of
the initiation codon. The nucleotide sequence in the
site of this modification is depicted below:



                    ArgLys
          5' TCTCACATGCGCAAG 3'
          3' AGAGTGTACGCGTTC 5'          (STV)

                 MstI

                    GCAAG-
                    CGTTC-               (STV)

       5' GCATGGTACCATGC 3'
       3' CGTACCATGGTACG 5'              (Oligonucleotide)

                 DNA ligase

       GCATGGTACCATGC GCAAG-
       CGTACCATGGTACG CGTTC-

                 KpnI

                   ArgLys
              CATGCGCAAG-
          CATGGTACGCGTTC-

Autoradiography of a sequencing gel verified the se- 
quence of the modified region. 
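
The 5' modification can be followed step by step in a few lines of code. This is a sketch under stated assumptions: the enzymes are taken to be MstI (blunt cut, TGC^GCA) and KpnI (GGTAC^C), as the sequences in the scheme indicate, and only the top strand is tracked. The variable names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


OLIGO = "GCATGGTACCATGC"   # synthetic oligonucleotide, top strand 5'->3'
STV_END = "GCAAG"          # blunt end of the STV fragment left by MstI
KPNI_SITE = "GGTACC"       # KpnI cuts after GGTAC on each strand

# Blunt-end ligation of the oligonucleotide onto the STV fragment.
ligated = OLIGO + STV_END

# KpnI digestion: keep the top strand downstream of the cut.
cut = ligated.index(KPNI_SITE) + 5
top_after_kpni = ligated[cut:]

print(reverse_complement(OLIGO) == OLIGO)  # the oligonucleotide is palindromic
print(top_after_kpni)
```

Because the oligonucleotide is its own reverse complement, it ligates in either orientation with the same result, and the fragment left after KpnI digestion again carries ATG CGC AAG (Met Arg Lys), the start of the coding region removed by MstI.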

The modified fragment (Fig. 7, step C) was subcloned
into pUC19 (Fig. 7, step D), treated with SmaI and the
fragment containing the STV gene purified (Fig. 7, step
E). After modification of both ends with EcoRI linkers,
the fragment was treated with HincII (Fig. 7, step F)
and again modified by ligation of SphI linkers (Fig. 7,
step G). The nucleotide sequence in the site of the
modification of step G from Fig. 7 is:

                GlyValAsnAsnGly
          5' --GGCGTCAACAACGGC-- 3'
          3' --CCGCAGTTGTTGCCG-- 5'      (STV)

                 HincII

             --GGCGTC
             --CCGCAG

          5' GGCATGCC 3'
          3' CCGTACGG 5'                 (SphI linkers)

                 DNA ligase

             --GGCGTC GGCATGCC
             --CCGCAG CCGTACGG

                 SphI

               GlyVal
             --GGCGTCGGCATG
             --CCGCAGCC



Fusion of the STV gene with the LDL receptor gene. The
restriction map and nucleotide sequence of the human LDL
receptor gene have been previously determined (29). The
restriction sites used for the fusion were the EcoRI
site, located at about 0.7 kb, the SphI site, located at
about 2.1 kb, and the SmaI site, located at about 2.8
kb.






Figure 8 shows the reactions carried out to fuse both
genes. The plasmid containing the LDL receptor gene
(Fig. 8, step A) was treated with EcoRI and SphI. The
fragment shown in Figure 8, step B was purified and used
to insert the STV gene (Fig. 8, step C). The nucleotide
sequence in the fusion site of both genes is:

               GlyMetLeuLeu
          5' --GGCATGCTGCTG-- 3'
          3' --CCGTACGACGAC-- 5'         (LDL receptor)

                 SphI

                      CTGCTG--
                  GTACGACGAC--           (LDL receptor)

             --GGCGTCGGCATG
             --CCGCAGCC                  (STV)

                 DNA ligase

               Gly Val Gly Met Leu Leu
             --GGC GTC GGC ATG CTG CTG--    (STV-LDL receptor)
             --CCG CAG CCG TAC GAC GAC--

                     STV <-  -> HLDL

This nucleotide sequence was confirmed by autoradiography.
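
That the two reading frames stay in phase across the junction can be checked by translating the fused top strand. The sketch below builds the standard genetic code table and translates the 18 nucleotides shown in the scheme; the helper names are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate a DNA sequence in frame 1, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


STV_SIDE = "GGCGTCGGCATG"   # STV top strand left after SphI digestion
LDL_SIDE = "CTGCTG"         # LDL receptor top strand after SphI digestion

junction = STV_SIDE + LDL_SIDE
bottom_3_to_5 = junction.translate(COMPLEMENT)  # bottom strand, written 3'->5'

print(translate(junction))
print(bottom_3_to_5)
```

The translation reads Gly Val Gly Met Leu Leu, so the glycine codon contributed by the SphI linker joins the two coding regions without shifting the frame.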

After recovering the STV-LDL receptor fragment by
treatment with EcoRI and SmaI (Fig. 8, Step D), the fragment
was modified by the addition of EcoRI linkers (Fig. 8,
Step E).

The modified fragment, or fused gene, was subcloned into
pMV7, a mammalian expression vector, and the resulting
plasmid was used to transfect NIH 3T3 cells using the
calcium phosphate precipitation method (31). Colonies of
cells resistant to the antibiotic G418 were examined for
the expression of STV by means of the binding of red
blood cells coupled to biotinylated bovine serum albumin.
After washing off the excess of red cells, some of
the colonies had bound red cells, which is evidence that
the streptavidin fusion was expressed and transported to
the cell membrane.

What is claimed is:

1. Isolated DNA which encodes streptavidin.

2. DNA of claim 1, about 2 kb in length and derived
from a restriction endonuclease digestion of the
chromosomal DNA of Streptomyces avidinii.

3. DNA of claim 1 having the nucleic acid sequence set
forth in Figure 3.

4. DNA of claim 2 which comprises the entire region
encoding the polypeptide streptavidin, a region
encoding a signal peptide and the flanking region
DNA which occurs naturally at the 3' and 5' ends of
the coding region.

5. A recombinant cloning vehicle which comprises cloning
vehicle DNA and the DNA of claim 2, the cloning
vehicle DNA being characterized by the presence of
a first and a second restriction enzyme site and
the DNA of claim 2 being inserted into said sites.

6. The cloning vehicle of claim 5, wherein the inserted
DNA comprises the entire region coding for the
polypeptide streptavidin, a region encoding a signal
peptide and the flanking region DNA which occurs
naturally at the 3' and 5' ends of the coding
region.

7. A plasmid cloning vehicle of claim 5.

8. A phage cloning vehicle of claim 5.

9. The phage cloning vehicle of claim 8, wherein the
phage is M13.

10. The plasmid cloning vehicle of claim 7, wherein the
plasmid is a pUC plasmid.

11. A genetically engineered bacterial host cell which
comprises the cloning vehicle of claim 5.

12. An E. coli host cell of claim 11.

13. An E. coli host cell of claim 12 designated JM83
and having ATCC Accession No. 53307.

14. A mammalian host cell which comprises the cloning
vehicle of claim 5.

15. A mammalian host cell of claim 14 wherein the
mammalian cell is a NIH 3T3 cell.

16. Substantially pure, biotin free streptavidin which
comprises four identical polypeptide subunits, each
having a molecular weight of about 16,500 daltons
and having a multiplicity of free biotin binding
sites.

17. The streptavidin of claim 16, wherein each of the
subunits has the amino acid sequence of Figure 3.

18. The streptavidin of claim 16 having an amino terminal
label which is susceptible to proteolytic
cleavage.

19. The streptavidin of claim 18, wherein the amino
terminal label is a radiolabel or a fluorescent
label.

20. The streptavidin of claim 16, having a carboxy
terminal label susceptible to proteolytic cleavage.

21. The streptavidin of claim 20, wherein the carboxy
terminal label is a radiolabel or a fluorescent
label.

22. The streptavidin of claim 20, wherein the carboxy
terminal label is an identifiable cysteine.

23. The streptavidin of claim 16 wherein a majority of
the amino acids are in the beta-conformation.

24. A method of preparing streptavidin which comprises
cultivating the host cell of claim 11 under suitable
conditions permitting expression of the gene
encoding streptavidin and recovering the
streptavidin so produced.

25. A fused gene which comprises a first DNA fragment
encoding a target protein of interest fused to a
second DNA fragment encoding streptavidin, wherein
the streptavidin has a multiplicity of binding
sites for biotin or a biotin derivative, and wherein
the fused gene is capable of expressing a fused
protein in vivo when the gene is inserted into a
suitable expression vector and introduced into a
suitable host cell.

26. The fused gene of claim 25 wherein the first DNA
fragment is at the 5' end of the fused gene.

27. The fused gene of claim 25 wherein the DNA fragment
encoding streptavidin is at the 5' end of the fused
gene.

28. The fused gene of claim 25 wherein the DNA fragment
encoding streptavidin is 2 kb in length and is
derived from a restriction endonuclease digestion
of the chromosomal DNA of Streptomyces avidinii.

29. The fused gene of claim 25, wherein the
streptavidin DNA fragment has the nucleic acid
sequence of Figure 3.

30. The fused gene of claim 28, wherein the 2 kb fragment
contains the entire region encoding the
polypeptide streptavidin, a region encoding a signal
peptide and the flanking region DNA which occurs
naturally at the 3' and 5' ends of the coding
region.

31. The fused gene of claim 25, wherein the first DNA
fragment is the gene encoding the human LDL receptor
protein.

32. The fused gene of claim 31, which is capable of
expressing protein that consists of streptavidin at
the N-terminal region of the fused protein and the
LDL receptor protein at the C-terminal region of
the fused protein when the fused gene is inserted
into a suitable expression vector and introduced
into a suitable host cell.

33. An expression vector capable of expressing the
fused gene of claim 25, when introduced into a
suitable host cell, which comprises suitable carrier
DNA and the fused DNA fragments of claim 25.

34. A mammalian expression vector of claim 33.

35. A mammalian host cell which comprises the expression
vector of claim 34.

36. An NIH 3T3 host cell of claim 35.

37. An expression vector capable of expressing the
fused gene of claim 31, when introduced into a
suitable host cell, which comprises suitable carrier
DNA and the fused DNA fragments of claim 31.

38. A mammalian expression vector of claim 37.

39. A mammalian host cell which comprises the expression
vector of claim 38.

40. An NIH 3T3 host cell of claim 39.

41. A fused protein, encoded by the fused gene of claim
25, wherein the target protein of interest is fused
to streptavidin, and wherein the streptavidin has a
multiplicity of binding sites for biotin or a biotin
derivative.

42. The fused protein of claim 41, wherein the target
protein is a monoclonal antibody.

43. The fused protein of claim 41, wherein the target
protein is the human LDL receptor protein.

44. The fused protein of claim 41, wherein the biotin
derivative is a fluorescent biotin.

45. A method for producing a labeled protein of interest
in vivo which comprises:

a) ligating the DNA encoding the protein of interest
to the DNA encoding streptavidin of claim 1
thereby producing a fused gene;

b) inserting the fused gene into a suitable expression
vector;

c) introducing the expression vector into a suitable
host cell under appropriate conditions
permitting expression of the fused gene and
production of the fused protein;

d) isolating the fused protein;

e) incubating the fused protein with biotin or a
biotin derivative in vitro, thereby producing a
fused protein-biotin complex wherein the biotin
or biotin derivative is bound to the
streptavidin portion of the fused protein; and

f) introducing the fused protein-biotin complex
into the host cell of step (c) under appropriate
conditions that allow the biotin or
biotin derivative to bind with unlabeled fused
protein produced by the host cell, thereby
producing a labeled or chemically modified
protein of interest in vivo.

46. The method of claim 45 wherein the biotin derivative
of step (e) is fluorescent.

47. A method of isolating a protein of interest which
comprises:

a) ligating the DNA encoding the protein of interest
to the DNA encoding streptavidin of claim 1
thereby producing a fused gene;

b) inserting the fused gene into a suitable expression
vector;

c) introducing the expression vector into a suitable
host cell under appropriate conditions
permitting expression of the fused gene and
production of the fused protein;

d) contacting the fused protein with biotin or a
biotin derivative under conditions permitting
the biotin or biotin derivative to bind to the
streptavidin portion of the fused protein,
thereby producing a fused protein-streptavidin-biotin
complex; and

e) isolating the complex and thereby isolating the
protein of interest.

Figure 1

Amino acid sequence determined from the gel-purified protein

1                                   10
Asp Pro Ser Lys Asp Ser Lys Ala Gln Val Ser Ala Ala Glu Ala
                20                                      30
Gly Ile Thr Gly Thr Trp Tyr Asn Gln Leu Gly Ser Thr Phe Ile
                                    40
Val Thr Ala Gly Ala Asp Gly Ala Leu Thr

Oligonucleotide probes used

                          7   8   9   10
Amino acid sequence      Lys Ala Gln Val
Possible codons       5' AAA GCN CAA GUN 3'
                           G       G
Probe Stv11              TTT CGN GTT CA
                           C       C

                          21  22  23  24  25
Amino acid sequence      Trp Tyr Asn Gln Leu
Possible codons       5' UGG UAU AAU CAA CUN 3'
                               C   C   G UUA
                                           G
Probe Stv14              ACC ATA TTA GTT GA
                               G   G   C A

Figure 3

   1  5' CCCTCCGTCCCCGCCGGGCAACAACTAGGGAGTATTTTTCGTGTCTCAC

                      -20                                     -10
      Met Arg Lys Ile Val Val Ala Ala Ile Ala Val Ser Leu Thr Thr
  50  ATG CGC AAG ATC GTC GTT GCA GCC ATC GCC GTT TCC CTG ACC ACG
       MstI
                                          1
      Val Ser Ile Thr Ala Ser Ala Ser Ala Asp Pro Ser Lys Asp Ser
  95  GTC TCG ATT ACG GCC AGC GCT TCG GCA GAC CCC TCC AAG GAC TCG

                  10                                      20
      Lys Ala Gln Val Ser Ala Ala Glu Ala Gly Ile Thr Gly Thr Trp
 140  AAG GCC CAG GTC TCG GCC GCC GAG GCC GGC ATC ACC GGC ACC TGG

                                      30
      Tyr Asn Gln Leu Gly Ser Thr Phe Ile Val Thr Ala Gly Ala Asp
 185  TAC AAC CAG CTC GGC TCG ACC TTC ATC GTG ACC GCG GGC GCC GAC

                  40                                      50
      Gly Ala Leu Thr Gly Thr Tyr Glu Ser Ala Val Gly Asn Ala Glu
 230  GGC GCC CTG ACC GGA ACC TAC GAG TCG GCC GTC GGC AAC GCC GAG

                                      60
      Ser Arg Tyr Val Leu Thr Gly Arg Tyr Asp Ser Ala Pro Ala Thr
 275  AGC CGC TAC GTC CTG ACC GGT CGT TAC GAC AGC GCC CCG GCC ACC

                  70                                      80
      Asp Gly Ser Gly Thr Ala Leu Gly Trp Thr Val Ala Trp Lys Asn
 320  GAC GGC AGC GGC ACC GCC CTC GGT TGG ACG GTG GCC TGG AAG AAT

                                      90
      Asn Tyr Arg Asn Ala His Ser Ala Thr Thr Trp Ser Gly Gln Tyr
 365  AAC TAC CGC AAC GCC CAC TCC GCG ACC ACG TGG AGC GGC CAG TAC

                  100                                     110
      Val Gly Gly Ala Glu Ala Arg Ile Asn Thr Gln Trp Leu Leu Thr
 410  GTC GGC GGC GCC GAG GCG AGG ATC AAC ACC CAG TGG CTG CTG ACC

                                      120
      Ser Gly Thr Thr Glu Ala Asn Ala Trp Lys Ser Thr Leu Val Gly
 455  TCC GGC ACC ACC GAG GCC AAC GCC TGG AAG TCC ACG CTG GTC GGC

                  130                                     140
      His Asp Thr Phe Thr Lys Val Lys Pro Ser Ala Ala Ser Ile Asp
 500  CAC GAC ACC TTC ACC AAG GTG AAG CCG TCC GCC GCC TCC ATC GAC

                                      150
      Ala Ala Lys Lys Ala Gly Val Asn Asn Gly Asn Pro Leu Asp Ala
 545  GCG GCG AAG AAG GCC GGC GTC AAC AAC GGC AAC CCG CTC GAC GCC
                              HincII

      Val Gln Gln Stop
 590  GTT CAG CAG TAG TCGCGTCCCGGCACCGGCGGGTGCCGGGACCTCGGCC 3'

Figure 5

Figure 6